As this is the very first exercise in this workshop, it is not that hard. Its purpose is to get you used to the exercise format and, more importantly, to give you a first feel for working in the tidyverse.

Just two short notes on working with the exercise files in this workshop:

  1. Please solve all tasks by writing your solutions into your own R script files. This keeps your solutions reproducible and lets you (re-)use solutions from earlier exercises in later ones.

  2. All exercises and their solutions ‘assume’ they are in the ./solutions folder of this repository. This way they can make use of files in other folders using relative paths. In order for your scripts to run properly, we suggest that you create (and save) them either in the exercises or solutions folder and set your working directory for the exercises accordingly (you can check your working directory with getwd() and change it with setwd()).
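
A quick way to confirm that the relative paths will resolve from your working directory is sketched below. It assumes the data file used later in this exercise (../data/titanic/titanic.csv) and uses a placeholder path for setwd(), which you would replace with the location of the repository on your own machine.

```r
# Show the directory R currently treats as the working directory.
current_dir <- getwd()
current_dir

# Placeholder path (an assumption): point this at the exercises or
# solutions folder of your local copy of the repository.
# setwd("path/to/repository/solutions")

# TRUE if the relative path used in this exercise resolves from here.
data_found <- file.exists("../data/titanic/titanic.csv")
data_found
```

If the last line returns FALSE, the working directory is most likely not one of the two folders mentioned above.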

Again, the following exercise is really short and just supposed to let you play around with some pipes and tibbles. It’s a mini-exercise!

First things first: To work with the ‘tidyverse’, we have to have access to its packages.

1

Load the tidyverse package.
If the tidyverse package has not been installed yet, you can install it with the command install.packages("tidyverse").
if (!require(tidyverse)) install.packages("tidyverse")
library(tidyverse)

After successfully loading the tidyverse, we turn to the magic world of pipes. Remember, pipes are a convenient way to disentangle nested R function calls and to write cleaner R code. First, have a look at the code in the following block:

mean(sqrt(as.numeric(read.csv2("../data/titanic/titanic.csv", sep = ",")$Fare)))

2

What do you think the command is doing?
It’s always a good approach to start reading/interpreting code from the innermost command and work outwards.
  1. The titanic data are imported with read.csv2()
  2. Only the Fare variable is extracted using the $ operator
  3. The variable is converted to the numeric format with as.numeric()
  4. A square root transformation is applied with sqrt()
  5. The mean is calculated with mean()
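
The five steps above can also be written with one intermediate object per step, which makes the inside-out reading order explicit. This is only a sketch: it uses a made-up three-row stand-in for the Titanic file (the names and fares are invented for illustration) so that it runs without the workshop data.

```r
# Made-up stand-in for titanic.csv: comma-separated text with a Fare column.
csv_text <- "Name,Fare\nA,4\nB,9\nC,16"

passengers   <- read.csv2(text = csv_text, sep = ",")  # 1. import the data
fare         <- passengers$Fare                        # 2. extract one column
fare_numeric <- as.numeric(fare)                       # 3. convert to numeric
fare_sqrt    <- sqrt(fare_numeric)                     # 4. square root: 2, 3, 4
result       <- mean(fare_sqrt)                        # 5. mean: (2 + 3 + 4) / 3
result
```

Each line corresponds to one item in the list, at the cost of four intermediate objects that are never used again.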

Using the commands in such a way makes the code somewhat difficult to read and understand. You have already learned that pipes provide a straightforward approach to address this issue.

3

Create a pipe from this nested command.
You can call individual columns of a piped object with .$col_name.
read.csv2("../data/titanic/titanic.csv", sep = ",") %>% 
  .$Fare %>% 
  as.numeric() %>% 
  sqrt() %>% 
  mean()
## [1] 10.46045
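
As an aside, dplyr's pull() is an alternative to .$col_name for extracting a single column inside a pipe. The sketch below compares the two on a small made-up data frame (toy_fares is a hypothetical name, not part of the workshop data).

```r
library(tidyverse)

# Made-up data for illustration only.
toy_fares <- data.frame(Name = c("A", "B", "C"), Fare = c(4, 9, 16))

# Column extraction with the dot placeholder, as in the solution above.
with_dollar <- toy_fares %>%
  .$Fare %>%
  sqrt() %>%
  mean()

# Column extraction with dplyr::pull().
with_pull <- toy_fares %>%
  pull(Fare) %>%
  sqrt() %>%
  mean()

# Both pipes compute the same value.
identical(with_dollar, with_pull)
```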

As we have already learned, the tidyverse is not only about pipes; it’s also about specific data formats. The default data format in the tidyverse is the tibble. In the previous task, you have already imported the titanic data, but it is still in the standard data.frame format.

4

Load the titanic dataset and convert it to a tibble.
Base R’s read.csv2() is your friend. Also, you may want to do it all at once in one pipe.
titanic_tibble <-
  read.csv2("../data/titanic/titanic.csv", sep = ",") %>% 
  as_tibble()
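
To check that a conversion like this worked, you can inspect the class of the result. The sketch below uses a small made-up data frame (plain_df is a hypothetical name) instead of the titanic data so that it runs on its own.

```r
library(tidyverse)

# Made-up data for illustration only.
plain_df <- data.frame(id = 1:3, label = c("a", "b", "c"))
converted <- as_tibble(plain_df)

class(plain_df)        # "data.frame"
class(converted)       # "tbl_df" "tbl" "data.frame"
is_tibble(converted)   # TRUE
```

A tibble still carries the data.frame class, so functions that expect a data frame generally accept it as well.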

Now, look at the following data.frame. It was created with standard base R tools. The tibble package also provides the tribble() command to create small data tables as tibbles from scratch, row by row.

##   day amount_coffee words_written
## 1   1             2           245
## 2   2             5           691
## 3   3             1            10
## 4   4             8          2100
## 5   5             4           490
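
The code that produced this data.frame is not shown here. One plausible base R way to build it, given only as an illustration, is a call to data.frame() with one vector per column (coffee_df is a hypothetical object name):

```r
# Column-wise construction with base R; the values are copied from the
# printed table above.
coffee_df <- data.frame(
  day           = 1:5,
  amount_coffee = c(2, 5, 1, 8, 4),
  words_written = c(245, 691, 10, 2100, 490)
)
coffee_df
```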

5

Use the tribble() function to recreate the above data frame as a tibble.
Remember to define column names by preceding them with a ~ (tilde).
tribble(
  ~day,  ~amount_coffee,   ~words_written, 
  1,                 2,              245,
  2,                 5,              691,
  3,                 1,               10,
  4,                 8,             2100,
  5,                 4,              490
)
## # A tibble: 5 x 3
##     day amount_coffee words_written
##   <dbl>         <dbl>         <dbl>
## 1     1             2           245
## 2     2             5           691
## 3     3             1            10
## 4     4             8          2100
## 5     5             4           490
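
For comparison, the same tibble can be built column by column with tibble(). The sketch below shows both forms side by side and checks that they agree; by_row and by_col are hypothetical object names.

```r
library(tidyverse)

# Row-wise definition, as in the solution above.
by_row <- tribble(
  ~day, ~amount_coffee, ~words_written,
  1,    2,              245,
  2,    5,              691,
  3,    1,              10,
  4,    8,              2100,
  5,    4,              490
)

# Column-wise definition of the same data.
by_col <- tibble(
  day           = c(1, 2, 3, 4, 5),
  amount_coffee = c(2, 5, 1, 8, 4),
  words_written = c(245, 691, 10, 2100, 490)
)

# TRUE if both tibbles hold the same columns and values.
isTRUE(all.equal(by_row, by_col))
```

tribble() tends to be easier to read for small tables typed by hand, while tibble() fits better when the columns already exist as vectors.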